In this tutorial, we will apply CrowdTruth metrics to a sparse multiple-choice crowdsourcing task for Event Extraction from sentences. Workers were asked to read a sentence and then pick from a multiple-choice list the words or word phrases in the sentence that are events or actions. The options available in the multiple-choice list change with the input sentence. The task was executed on FigureEight. For more crowdsourcing annotation task examples, click here.
We will also show how to translate an open task into a closed task by processing both the input units and the annotations of a crowdsourcing task, and how this impacts the results of the CrowdTruth quality metrics. We start with the open-ended extraction task described above.
To replicate this experiment, the code used to design and implement this crowdsourcing annotation template is available here: template, css, javascript.
This is a screenshot of the task as it appeared to workers:
A sample dataset for this task is available in this file, containing raw output from the crowd on FigureEight. Download the file and place it in a folder named data that sits next to the folder containing this notebook (the code below reads it from ../data/). Now you can check your data:
In [1]:
import pandas as pd
test_data = pd.read_csv("../data/event-text-sparse-multiple-choice.csv")
test_data.head()
Out[1]:
In [2]:
import crowdtruth
from crowdtruth.configuration import DefaultConfig
Our test class inherits the default configuration DefaultConfig, while also declaring some additional attributes that are specific to the Event Extraction task:

- inputColumns: list of input columns from the .csv file with the input data
- outputColumns: list of output columns from the .csv file with the answers from the workers
- annotation_separator: string that separates the crowd annotations in outputColumns
- open_ended_task: boolean variable defining whether the task is open-ended (i.e. the possible crowd annotations are not known beforehand, like in the case of free text input); because the list of options changes with each input sentence, we first treat the task as open-ended and set this variable to True
- annotation_vector: list of possible crowd answers, mandatory to declare when open_ended_task is False; it is not needed here, but we will use it later, when we translate the task into a closed one, as the list of all events that were given as input to the crowd in at least one sentence
- processJudgments: method that defines processing of the raw crowd data; for this task, we process the crowd answers to correspond to the values in annotation_vector
The complete configuration class is declared below:
In [3]:
class TestConfig(DefaultConfig):
    inputColumns = ["doc_id", "events", "events_count", "original_sentence", "processed_sentence", "sentence_id", "tokens"]
    outputColumns = ["selected_events"]
    annotation_separator = ","

    # processing of an open task
    open_ended_task = True

    def processJudgments(self, judgments):
        # pre-process output to match the values in annotation_vector
        for col in self.outputColumns:
            # transform to lowercase
            judgments[col] = judgments[col].apply(lambda x: str(x).lower())
            # remove square brackets from annotations
            judgments[col] = judgments[col].apply(lambda x: str(x).replace('[',''))
            judgments[col] = judgments[col].apply(lambda x: str(x).replace(']',''))
            # remove the quotes around the annotations
            judgments[col] = judgments[col].apply(lambda x: str(x).replace('"',''))
        return judgments
In [4]:
data_open, config = crowdtruth.load(
    file = "../data/event-text-sparse-multiple-choice.csv",
    config = TestConfig()
)
data_open['judgments'].head()
Out[4]:
In [6]:
results_open = crowdtruth.run(data_open, config)
results_open is a dict object that contains the quality metrics for sentences, events and crowd workers.
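For instance, you can quickly inspect which result tables are available (the exact set of keys depends on the crowdtruth-core version, so treat this as a sanity check rather than a specification):
In [ ]:
# list the available result tables; typically "units", "workers",
# "annotations" and "judgments" are present
print(results_open.keys())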
The sentence metrics are stored in results_open["units"]:
In [7]:
results_open["units"].head()
Out[7]:
The uqs column in results_open["units"] contains the sentence quality scores, capturing the overall worker agreement over each sentence. Here we plot its histogram:
In [8]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.hist(results_open["units"]["uqs"])
plt.xlabel("Sentence Quality Score")
plt.ylabel("Sentences")
Out[8]:
The unit_annotation_score column in results_open["units"] contains the sentence-event scores, capturing the likelihood that an event is expressed in a sentence. For each sentence, we store a dictionary mapping each event to its sentence-event score.
In [9]:
results_open["units"]["unit_annotation_score"].head(10)
Out[9]:
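As an illustration of how these scores can be used, the sketch below picks the highest-scoring event for each sentence; it assumes each entry of unit_annotation_score behaves like a dictionary mapping an annotation to its score:
In [ ]:
# a sketch: the most likely event (or "no_event") for every sentence
best_event = results_open["units"]["unit_annotation_score"].apply(
    lambda scores: max(scores, key=scores.get)
)
best_event.head()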
The worker metrics are stored in results_open["workers"]:
In [10]:
results_open["workers"].head()
Out[10]:
The wqs column in results_open["workers"] contains the worker quality scores, capturing the overall agreement between each worker and all the other workers.
In [27]:
plt.hist(results_open["workers"]["wqs"])
plt.xlabel("Worker Quality Score")
plt.ylabel("Workers")
Out[27]:
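A common use of the worker quality scores is to filter out low-quality workers. The sketch below flags workers under an illustrative threshold of 0.1; the threshold is an assumption for this example, not a value prescribed by CrowdTruth:
In [ ]:
# workers whose quality score falls below an illustrative threshold
low_quality_workers = results_open["workers"][results_open["workers"]["wqs"] < 0.1]
low_quality_workers.head()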
The goal of this crowdsourcing task is to understand how clearly a word or word phrase expresses an event or an action across all the sentences in the dataset, rather than at the level of a single sentence as above. Therefore, in the remainder of this tutorial we show how to translate the open task into a closed task by processing both the input units and the annotations of the crowdsourcing task.
The answers from the crowd are stored in the selected_events
column.
In [28]:
test_data["selected_events"][0:30]
Out[28]:
As you already know, each word can be expressed in a canonical form, i.e., as a lemma. For example, the words run, runs and running all have the lemma run. As you can see in the previous cell, events in text can appear under multiple forms. To evaluate the clarity of each event, we will process both the input units and the crowd annotations to refer to each word in its canonical form, i.e., we will lemmatize them.
Next, we define the function used to lemmatize the options that are shown to the workers in the crowdsourcing task:
In [29]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')

from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet

def nltk2wn_tag(nltk_tag):
    # map Penn Treebank POS tags to WordNet POS tags
    if nltk_tag.startswith('J'):
        return wordnet.ADJ
    elif nltk_tag.startswith('V'):
        return wordnet.VERB
    elif nltk_tag.startswith('N'):
        return wordnet.NOUN
    elif nltk_tag.startswith('R'):
        return wordnet.ADV
    else:
        return None

def lemmatize_events(event):
    # tokenize and POS-tag the event phrase (anything after "__" is ignored)
    nltk_tagged = nltk.pos_tag(nltk.word_tokenize(str(event.lower().split("__")[0])))
    wn_tagged = map(lambda x: (str(x[0]), nltk2wn_tag(x[1])), nltk_tagged)
    res_words = []
    for word, tag in wn_tagged:
        # fall back to the noun lemma when the POS tag has no WordNet equivalent
        if tag is None:
            res_word = wordnet._morphy(str(word), wordnet.NOUN)
        else:
            res_word = wordnet._morphy(str(word), tag)
        if res_word == []:
            res_words.append(str(word))
        elif len(res_word) == 1:
            res_words.append(str(res_word[0]))
        else:
            res_words.append(str(res_word[1]))
    lematized_keyword = " ".join(res_words)
    return lematized_keyword
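As a quick sanity check (assuming the NLTK resources above downloaded successfully), lemmatize_events should map inflected forms to a canonical form; the exact output can vary slightly with the POS tagger:
In [ ]:
# expected to yield something like "run" and "buy"
print(lemmatize_events("ran"))
print(lemmatize_events("bought"))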
The following functions create the values of the annotation vector and extract the lemmas of the events selected by each worker.
In [30]:
def define_annotation_vector(eventsList):
    # collect the lemmatized form of every event option shown to the crowd
    events = []
    for i in range(len(eventsList)):
        currentEvents = eventsList[i].split("###")
        for j in range(len(currentEvents)):
            if currentEvents[j] != "no_event":
                lematized_keyword = lemmatize_events(currentEvents[j])
                if lematized_keyword not in events:
                    events.append(lematized_keyword)
    events.append("no_event")
    return events

def lemmatize_keywords(keywords, separator):
    # lemmatize each annotation in a separator-delimited string of crowd answers
    keywords_list = keywords.split(separator)
    lematized_keywords = []
    for keyword in keywords_list:
        lematized_keyword = lemmatize_events(keyword)
        lematized_keywords.append(lematized_keyword)
    return separator.join(lematized_keywords)
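To see what these helpers produce, here is a small illustrative call; the input strings are made up for this example, while the actual crowd answers come from the selected_events column:
In [ ]:
# lemmatize a comma-separated list of crowd answers
print(lemmatize_keywords("ran,bought", ","))          # expected: something like "run,buy"
# build an annotation vector from a tiny made-up events column
print(define_annotation_vector(["ran###no_event"]))   # expected: something like ['run', 'no_event']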
In [31]:
class TestConfig(DefaultConfig):
    inputColumns = ["doc_id", "events", "events_count", "original_sentence", "processed_sentence", "sentence_id", "tokens"]
    outputColumns = ["selected_events"]
    annotation_separator = ","

    # processing of a closed task
    open_ended_task = False
    annotation_vector = define_annotation_vector(test_data["events"])

    def processJudgments(self, judgments):
        # pre-process output to match the values in annotation_vector
        for col in self.outputColumns:
            # transform to lowercase
            judgments[col] = judgments[col].apply(lambda x: str(x).lower())
            # remove square brackets from annotations
            judgments[col] = judgments[col].apply(lambda x: str(x).replace("[",""))
            judgments[col] = judgments[col].apply(lambda x: str(x).replace("]",""))
            # remove the quotes around the annotations
            judgments[col] = judgments[col].apply(lambda x: str(x).replace('"',''))
            # lemmatize the crowd answers so they match the annotation vector
            judgments[col] = judgments[col].apply(lambda x: lemmatize_keywords(str(x), self.annotation_separator))
        return judgments
In [32]:
data_closed, config = crowdtruth.load(
    file = "../data/event-text-sparse-multiple-choice.csv",
    config = TestConfig()
)
data_closed['judgments'].head()
Out[32]:
In [36]:
results_closed = crowdtruth.run(data_closed, config)
The metrics for each event in the annotation vector are stored in results_closed["annotations"]:
In [37]:
results_closed["annotations"]
Out[37]:
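Assuming the annotation quality score column is named aqs, as in other CrowdTruth tutorials, its distribution can be plotted in the same way as the sentence and worker scores above:
In [ ]:
# histogram of the annotation quality scores (column name "aqs" is an assumption)
plt.hist(results_closed["annotations"]["aqs"])
plt.xlabel("Annotation Quality Score")
plt.ylabel("Events")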
In [39]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.scatter(
    results_open["units"]["uqs"],
    results_closed["units"]["uqs"],
)
plt.plot([0, 1], [0, 1], 'red', linewidth=1)
plt.title("Sentence Quality Score")
plt.xlabel("open task")
plt.ylabel("closed task")
Out[39]:
In [41]:
plt.scatter(
    results_open["workers"]["wqs"],
    results_closed["workers"]["wqs"],
)
plt.plot([0, 1], [0, 1], 'red', linewidth=1)
plt.title("Worker Quality Score")
plt.xlabel("open task")
plt.ylabel("closed task")
Out[41]:
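Beyond the visual comparison, a simple correlation gives a rough sense of how strongly the open-task and closed-task scores agree. This sketch assumes the two result sets share the same unit and worker indices, which holds here because both runs were computed from the same input file:
In [ ]:
# correlation between open-task and closed-task quality scores
print(results_open["units"]["uqs"].corr(results_closed["units"]["uqs"]))
print(results_open["workers"]["wqs"].corr(results_closed["workers"]["wqs"]))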